Abstract


Hub Labeling (HL) is a data structure for distance oracles. Hierarchical HL (HHL) is a special
type of HL that has received a lot of attention from a practical point of view. However, theoretical
questions such as NP-hardness and approximation guarantees for HHL algorithms have been left
aside. In this paper we study HL and HHL from the complexity theory point of view. We prove that
both HL and HHL are NP-hard, and present upper and lower bounds for the approximation ratios
of greedy HHL algorithms used in practice. We also introduce a new variant of the greedy HHL
algorithm and prove that it produces small labels for graphs with small highway dimension.

1 Introduction 

The point-to-point shortest path problem is a classical optimization problem with many applications. The
input to the problem is a graph G = (V, E), a length function ℓ : E → R, and a pair s, t ∈ V. We define
n = |V| and m = |E|. The goal is to find dist(s, t), the length of the shortest s-t path in G, where the
length of a path is the sum of the lengths of its arcs. We assume that the length function is non-negative
and that there are no zero-length cycles.

The hub labeling algorithm (HL) [8, 12] is a shortest path algorithm that computes vertex labels
during a preprocessing stage and answers s,t queries using only the labels of s and t; the input graph is
not used for queries [15]. For a directed graph a label L(v) for a vertex v ∈ V consists of the forward
label Lf(v) and the backward label Lb(v). The forward label Lf(v) consists of a sequence of pairs
(w, dist(v, w)), where dist(v, w) is the distance (in G) from v to w. The backward label Lb(v) is similar, with
pairs (u, dist(u, v)). Vertices w and u (for forward and backward labels, respectively) are called the hubs
of v. For an undirected graph Lf = Lb, and we denote the labeling by L, so L(v) itself is a set of pairs
(w, dist(v, w)).

The labels must obey the cover property: for any two vertices s and t, the set Lf(s) ∩ Lb(t) must
contain at least one hub v that is on a shortest s-t path (we say that v covers the [s, t] pair). Given
the labels, HL queries are straightforward: to find dist(s, t), simply find the hub v ∈ Lf(s) ∩ Lb(t) that
minimizes dist(s, v) + dist(v, t).
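The query can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: it assumes each label is stored as a Python dict from hub to distance, and the name `hl_query` is hypothetical.

```python
import math


def hl_query(forward_label_s, backward_label_t):
    """Return dist(s, t) from two hub labels, or math.inf if they share no hub.

    forward_label_s maps each hub v in Lf(s) to dist(s, v);
    backward_label_t maps each hub v in Lb(t) to dist(v, t).
    (Dict-based labels are an assumption made for this sketch.)
    """
    # Scan the smaller label and probe the larger one.
    small, large = forward_label_s, backward_label_t
    if len(small) > len(large):
        small, large = large, small
    best = math.inf
    for hub, d in small.items():
        if hub in large:
            best = min(best, d + large[hub])
    return best


# Undirected unit-length path a - b - c, labeled with b as the common hub.
labels = {"a": {"a": 0, "b": 1}, "b": {"b": 0}, "c": {"c": 0, "b": 1}}
```

With sorted arrays instead of dicts the same intersection can be computed by a linear merge, which is why query time depends only on the label sizes.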

Query time and space complexity depend on the label size. The size of a label |L(v)| is the
number of hubs it contains. For a directed graph the size of a forward (backward) label, |Lf(v)|
(|Lb(v)|), is the number of hubs it contains and the size of the full label of v, L(v) = (Lf(v), Lb(v)), is
|L(v)| = |Lf(v)| + |Lb(v)|. Unless mentioned otherwise, preprocessing algorithms attempt to minimize
the total labeling size |L| = Σ_{v ∈ V} |L(v)|.

Cohen et al. [8] give an O(log n)-approximation algorithm for HL preprocessing. This algorithm
was generalized in [7] and sped up in [10]. These approximation algorithms compute small labels but,
although polynomial, do not scale to large problems [10].

A special case of HL is hierarchical hub labeling (HHL) [3], where vertices are globally ranked by
“importance” and the label for a vertex v can only have more important hubs than v and v itself. HHL
implementations are faster in practice than general HL ones. For several important graph classes,
such as road and complex networks, HHL implementations find small labelings and scale to large
problems [SllliEllg!. However, for the algorithms used in practice such as hierarchical greedy (g-HHL) 
and hierarchical weighted greedy (w-HHL) there was no theoretical guarantee on the approximation ratio. 

Most of the work on the computational complexity of HL (and HHL) algorithms is experimental. The 
exceptions are approximation algorithms for HL mentioned above, and upper bounds for HL in case of low 
highway dimension I3I1II]- However, there was no NP-completeness proof for HL. NP-completeness
was implicitly conjectured in [8]: this assumption motivates the O(log n)-approximation algorithm. In
addition, in [8] the authors prove that a more general problem, in which the paths to cover are part of 
the input, is NP-complete (which does not imply NP-hardness of the original problem). 

In this paper we obtain the following results on HL and HHL complexity: 

• We show that both the optimal HL and the optimal HHL problems are NP-complete. 

• We show that in a network of highway dimension h and diameter D, there is an HHL such that 
every label size is O(h log D), matching the HL bound of [2l|3l[T].

• We propose a variant of the greedy algorithm (called d-HHL), for which we prove

— an O(h log n log D) bound for every label size,

— an O(√n log n log D)-approximation ratio compared to the optimal HL (and therefore the
optimal HHL),

— an Ω(√n) lower bound on the approximation ratio for the optimal HHL.

• For g-HHL, we prove

— an O(√n log n)-approximation ratio compared to the optimal HL,

— an Ω(√n) lower bound on the approximation ratio for the optimal HHL.

• For w-HHL, we prove

— an O(√n log n)-approximation ratio compared to the optimal HL,

— an Ω(∛n) lower bound on the approximation ratio for the optimal HHL.

• We give an example showing that hierarchical labelings can be Ω(√n) times bigger than general labelings,
improving and simplifying m-

Our lower bounds on the greedy algorithms show that they do not give a poly-log approximation, 
leaving the question of the possibility of poly-log approximation open. This is an interesting theoretical 
problem that may have a practical impact as well. 

2 Preliminaries 

2.1 HL Approximation Algorithm 

Cohen et al. obtain their O(log n)-approximation algorithm for HL by formulating it as a weighted set
cover problem and applying the well-known greedy approximation algorithm for set cover. In the weighted
set cover problem there is a universe set U, a family F of some subsets of U, a cost function c : F → R₊,
and the goal is to find a collection C ⊆ F such that ∪_{S ∈ C} S = U and Σ_{S ∈ C} c(S) is minimized. The greedy
set cover algorithm starts with an empty C, then iteratively picks a set S which maximizes the ratio of
the number of newly covered elements in U to the cost of S and adds S to C.
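The greedy rule just described can be sketched as follows. This is an illustrative implementation on explicitly listed sets (the HL instance has exponentially many sets and is never materialized this way); the names `greedy_set_cover`, `example_sets` and `example_cost` are assumptions of the sketch.

```python
def greedy_set_cover(universe, sets, cost):
    """Greedy weighted set cover.

    sets maps a set name to a Python set of elements; cost maps a set name
    to a positive number. Returns the chosen names in the order picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Only sets that cover something new are candidates.
        candidates = [name for name in sets if sets[name] & uncovered]
        if not candidates:
            raise ValueError("the given sets do not cover the universe")
        # Maximize (newly covered elements) / cost.
        best = max(candidates,
                   key=lambda name: len(sets[name] & uncovered) / cost[name])
        chosen.append(best)
        uncovered -= sets[best]
    return chosen


example_sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 2, 3, 4, 5}}
example_cost = {"A": 1, "B": 1, "C": 1, "D": 10}
picked = greedy_set_cover({1, 2, 3, 4, 5}, example_sets, example_cost)
```

Here the expensive set D covers everything at once but has the worst coverage-to-cost ratio, so the greedy rule prefers the cheap sets A and then C.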

The elements to cover in the equivalent set cover instance are vertex pairs [u, w]. For a directed graph
pairs in U are ordered and for an undirected graph pairs are unordered. We first discuss directed graphs,
then undirected ones. Every set P of vertex pairs for which some vertex v hits a
shortest path between every pair in P is a set of the instance. (There are exponentially many sets, but they are not used
explicitly.) The cost of a set P is the number of vertices that appear in the first component of a pair in P 
plus the number of vertices that appear in the second component of a pair in P. 

The greedy approximation algorithm for set cover as applied to this set cover instance is as follows.
The algorithm maintains the set U of uncovered vertex pairs: [u, w] ∈ U if Lf(u) ∩ Lb(w) does not contain
a vertex on a shortest u-w path. Initially U contains all vertex pairs [u, w] such that w is reachable from
u. The algorithm terminates when U becomes empty. Starting with an empty labeling, in each iteration,
the algorithm adds a vertex v to forward labels of vertices in a set S' ⊆ V and to backward labels of
the vertices in S'' ⊆ V such that the ratio of the number of newly-covered pairs over the total increase
in the size of the labeling is (approximately) maximized. Formally, let U(v, S', S'') be the set of pairs
in U which are covered if v is added to Lf(u) : u ∈ S' and Lb(w) : w ∈ S''. The algorithm maximizes
|U(v, S', S'')| / (|S'| + |S''|) over all v ∈ V and S', S'' ⊆ V.

To find the triples (v, S', S'') efficiently the algorithm uses center graphs defined as follows. A center
graph of v, G_v = (X, Y, A_v), is a bipartite graph with X = V, Y = V, and an arc (u, w) ∈ A_v if [u, w] ∈ U
and some shortest path from u to w goes through v. The algorithm finds (v, S', S'') that maximizes
|U(v, S', S'')| / (|S'| + |S''|) by computing a densest subgraph among all the subgraphs of the center graphs
G_v. The density of a graph G = (V, A) is |A|/|V|. The maximum density subgraph (MDS) problem is the
problem of finding an (induced) subgraph of a given graph G of maximum density. This problem can be
solved in polynomial time using parametric flows (e.g., El). For a vertex v, the arcs of a subgraph of G_v
induced by S' ⊆ X and S'' ⊆ Y correspond to the pairs of vertices in U that become covered if v is added
to Lf(u) : u ∈ S' and Lb(w) : w ∈ S''. Therefore, the MDS of G_v maximizes |U(v, S', S'')| / (|S'| + |S''|)
over all S', S''.

For undirected graphs we have Lf(v) = Lb(v) = L(v) by definition. Pairs [u, w] ∈ U are unordered
and the cost of a set P of unordered vertex pairs is the number of vertices that appear in a pair in P. Let
U(v, S) be the set of unordered vertex pairs that become covered if we add v to L(u) : u ∈ S. We want to
maximize |U(v, S)| / |S|. To find such a tuple, we use another type of center graph of v, G_v = (V, E_v). G_v
is an undirected graph with vertex set V and with an edge {u, w} ∈ E_v if [u, w] ∈ U and some shortest
path between u and w goes through v. (For a pair [u, u] there is a self-loop {u, u} in E_v.) Note that G_v is
not necessarily bipartite. As in the directed case, the MDS of G_v maximizes |U(v, S)| / |S| over all S.

The following is a folklore lemma about the greedy set cover algorithm. 

Lemma 2.1. If we run the greedy set cover algorithm where in each iteration we pick a set whose coverage
to cost ratio is at least a 1/f(n) fraction of the maximum coverage to cost ratio, then we get a cover of cost
within an O(f(n) log n) factor of optimal.

Cohen et al. [8] used this lemma and instead of finding the MDS exactly they used a linear-time
2-approximation algorithm [14]. The result is an O(log n)-approximation algorithm running in O(n^5)
time. Delling et al. [10] improve the running time to O(n^3 log n).
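To make the approximate MDS step concrete, here is a sketch of the standard greedy peeling heuristic for undirected graphs, which repeatedly deletes a minimum-degree vertex and keeps the densest intermediate subgraph; it is known to return a subgraph whose density is at least half the maximum. It is not necessarily the algorithm of the reference cited above, it is written for clarity rather than linear time, and it ignores self-loops. The name `peel_densest_subgraph` is hypothetical.

```python
def peel_densest_subgraph(vertices, edges):
    """Return (vertex_set, density) found by min-degree peeling.

    Density is |E| / |V| of the induced subgraph. Self-loops and parallel
    edges are ignored in this simplified sketch.
    """
    adj = {v: set() for v in vertices}
    for u, w in edges:
        if u != w:
            adj[u].add(w)
            adj[w].add(u)
    alive = set(adj)
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    best_density, best_set = 0.0, set(alive)
    while alive:
        density = num_edges / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
        # Remove a vertex of minimum degree in the current subgraph.
        v = min(alive, key=lambda x: len(adj[x]))
        num_edges -= len(adj[v])
        for nb in adj[v]:
            adj[nb].discard(v)
        adj[v] = set()
        alive.remove(v)
    return best_set, best_density


# A 4-clique with one pendant vertex attached to vertex 4.
k4_plus_tail = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5)]
dense_set, dense_value = peel_densest_subgraph([1, 2, 3, 4, 5], k4_plus_tail)
```

On this example the whole graph has density 7/5, and peeling off the pendant vertex exposes the clique with density 6/4.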


2.2 Canonical HHL 

Vertices are ordered if there is a bijection π : V → {1, ..., |V|}. We say that u is more important than v
if π(u) < π(v). The labeling L is hierarchical if there is an order π such that u ∈ Lf(v) ∪ Lb(v) implies
π(u) ≤ π(v). In this case we say that L respects π.

Let P_{u,v} denote the set of all vertices on shortest paths from u to v. For an order π we define a
canonical HHL in the following way: u ∈ Lf(v) (resp. u ∈ Lb(v)) if and only if u is the most important
vertex in P_{v,u} (resp. P_{u,v}). The following theorem is implicit in [111^IT5|.

Theorem 2.2. For an order π the canonical HHL is the minimum HHL that respects π.

Proof. We first show that the canonical HHL L obeys the cover property. For a pair [v, w] let u be the
most important vertex in P_{v,w}. Consider any v-u shortest path. It is easy to see that it is a subpath
of some v-w shortest path, so u is also the most important vertex in P_{v,u} and, symmetrically, in P_{u,w}.
Therefore by the definition of canonical labeling we have u ∈ Lf(v) and u ∈ Lb(w).

Now we show that L is a sublabeling of any HHL L' that respects π. Let u ∈ Lf(v) (resp. u ∈ Lb(v)).
Then u is more important than any other vertex w on a v-u (resp. u-v) shortest path. Therefore L'b(u)
(resp. L'f(u)) doesn't have any such w except u. Since L' covers the [v, u] (resp. [u, v]) pair we have
u ∈ L'f(v) (resp. u ∈ L'b(v)). So L is a sublabeling of L'. □
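The canonical labeling can be computed directly from its definition on a small instance. The sketch below does this by brute force for an undirected graph with integer edge lengths (so that distance comparisons are exact); the helper names and the cubic all-pairs step are choices made for illustration, not the efficient constructions used in practice.

```python
INF = float("inf")


def all_pairs_distances(vertices, edges):
    """Floyd-Warshall on an undirected graph given as (u, v, length) triples."""
    dist = {u: {v: (0 if u == v else INF) for v in vertices} for u in vertices}
    for u, v, length in edges:
        dist[u][v] = min(dist[u][v], length)
        dist[v][u] = min(dist[v][u], length)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def canonical_labeling(vertices, edges, order):
    """Canonical HHL for `order` (most important vertex first).

    u is placed in L(v) exactly when u is the most important vertex
    among all vertices lying on shortest v-u paths.
    """
    rank = {v: i for i, v in enumerate(order)}
    dist = all_pairs_distances(vertices, edges)
    labels = {v: {} for v in vertices}
    for v in vertices:
        for u in vertices:
            if dist[v][u] == INF:
                continue
            on_paths = [w for w in vertices
                        if dist[v][w] + dist[w][u] == dist[v][u]]
            if min(on_paths, key=rank.get) == u:
                labels[v][u] = dist[v][u]
    return labels


path_vertices = ["a", "b", "c"]
path_edges = [("a", "b", 1), ("b", "c", 1)]
middle_first = canonical_labeling(path_vertices, path_edges, ["b", "a", "c"])
end_first = canonical_labeling(path_vertices, path_edges, ["a", "b", "c"])
```

On the three-vertex path, ranking the middle vertex first gives a total label size of 5, while ranking an endpoint first gives 6, which illustrates how the order alone determines the labeling.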


2.3 Greedy HHL Algorithms 


In this section we describe greedy HHL algorithms in terms of center graphs. For an alternative description 
and efficient implementation of these algorithms, see mi- 

A greedy HHL algorithm maintains the center graphs G_v = (X, Y, A_v) defined in Section 2.1. At
each iteration, the algorithm selects a center graph of a vertex v and adds v to Lf(u) for all non-isolated
vertices u ∈ X and to Lb(w) for all non-isolated vertices w ∈ Y. Note that after the labels are augmented
this way, all vertex pairs [u, w] for which there is a u-w shortest path passing through v are covered.
Therefore, the center graph of every vertex is chosen once, and the labeling is hierarchical.

Greedy algorithms differ by the criteria used to select the next center graph to process. The greedy 
HHL (g-HHL) algorithm selects the center graph with most edges. The weighted greedy HHL (w-HHL) 
algorithm selects a center graph with the highest density (the number of edges divided by the number of 
non-isolated vertices). 
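The two selection rules can be sketched by brute force on a small directed graph with integer arc lengths. The code recomputes every center graph from the set of uncovered pairs at each iteration, which is far slower than a practical implementation; the function names and the `rule` parameter are assumptions of this sketch.

```python
INF = float("inf")


def all_pairs_distances(vertices, arcs):
    """Floyd-Warshall on a directed graph given as (u, v, length) triples."""
    dist = {u: {v: (0 if u == v else INF) for v in vertices} for u in vertices}
    for u, v, length in arcs:
        dist[u][v] = min(dist[u][v], length)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def greedy_hhl_order(vertices, arcs, rule="g"):
    """Return the vertex order chosen by g-HHL (rule="g") or w-HHL (rule="w")."""
    vertices = list(vertices)
    dist = all_pairs_distances(vertices, arcs)
    uncovered = {(u, w) for u in vertices for w in vertices if dist[u][w] < INF}

    def center_arcs(v):
        # Uncovered pairs that have a shortest path through v.
        return {(u, w) for (u, w) in uncovered
                if dist[u][v] + dist[v][w] == dist[u][w]}

    def score(v):
        arcs_v = center_arcs(v)
        if rule == "g":
            return len(arcs_v)  # number of edges of the center graph
        if not arcs_v:
            return 0.0
        xs = {u for u, _ in arcs_v}  # non-isolated vertices on the X side
        ys = {w for _, w in arcs_v}  # non-isolated vertices on the Y side
        return len(arcs_v) / (len(xs) + len(ys))

    order, remaining = [], list(vertices)
    while remaining:
        best = max(remaining, key=score)  # ties go to the earliest vertex
        uncovered -= center_arcs(best)
        order.append(best)
        remaining.remove(best)
    return order


chain = [("a", "b", 1), ("b", "c", 1)]
g_order = greedy_hhl_order(["a", "b", "c"], chain, rule="g")
w_order = greedy_hhl_order(["a", "b", "c"], chain, rule="w")
```

On the directed chain a → b → c both rules pick b first, since its center graph has four edges on four non-isolated vertices while the endpoints have three edges on four.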

We propose a new distance greedy HHL (d-HHL) algorithm. To every vertex pair [u, v] we assign a
weight

\[ W(u, v) = \begin{cases} 0, & \text{if } \operatorname{dist}(u, v) = 0, \\ n^{2 \lfloor \log_2(\operatorname{dist}(u, v)) \rfloor}, & \text{otherwise,} \end{cases} \]

and use W to weight the corresponding edges in center graphs. At each iteration, d-HHL selects a center
graph with the largest sum of edge weights.

We define the level of [u, v] as ⌊log_2(dist(u, v))⌋ (if dist(u, v) = 0 the level of [u, v] is −∞). The
definition of W ensures that if [u, v] is the maximum level uncovered vertex pair, W(u, v) is greater than
the total weight of all lower-level uncovered pairs. Therefore d-HHL primarily maximizes the number
of uncovered maximum level pairs that become covered, and other pairs that become covered are used
essentially as tie-breakers.

We say that a vertex w has level i if at the iteration when w is selected by d-HHL, the maximum level
of an uncovered vertex pair is i. As the algorithm proceeds, the levels of vertices it selects are monotonically
decreasing.
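The level and weight of a pair can be computed as below. The sketch reads the weight as n raised to twice the level for pairs at positive distance, and it assumes distances are at least 1 so that levels are non-negative and weights stay exact integers; the function names are hypothetical.

```python
import math


def pair_level(distance):
    """Level of a pair at positive distance: floor(log2(distance))."""
    return math.floor(math.log2(distance))


def pair_weight(distance, n):
    """Weight W of a pair in an n-vertex graph (assumes distance == 0 or >= 1)."""
    if distance == 0:
        return 0
    return n ** (2 * pair_level(distance))


# In a 4-vertex graph there are at most 4 * 3 pairs at positive distance.
top_pair = pair_weight(4, 4)               # level 2
lower_total_bound = 4 * 3 * pair_weight(3, 4)  # all other pairs at level 1
```

The last two lines illustrate the dominance property: one level-2 pair weighs 256, more than the 192 that twelve level-1 pairs could weigh together.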

2.4 Highway Dimension 

In this section we review the definition of highway dimension (HD) and related concepts. As HD is defined 
for undirected graphs, when we talk about HD we assume that all graphs are undirected and connected. 

Definition 2.3. Given a shortest path P = (v_1, ..., v_k) and r > 0, a shortest path P' is an r-witness for
P if and only if ℓ(P') > r and one of the following conditions holds:

1. P' = P; or

2. P' = (v_0, v_1, ..., v_k); or

3. P' = (v_1, ..., v_k, v_{k+1}); or

4. P' = (v_0, v_1, ..., v_k, v_{k+1}).

Definition 2.4. A shortest path P is r-significant if it has an r-witness path. 

Let 𝒫_r denote the set of all r-significant paths. Given a vertex v and a path P, we define the distance
from v to P by dist(v, P) = min_{w ∈ P} dist(v, w).

Definition 2.5. A shortest path P is (r, d)-close to a vertex v if P is r-significant with an r-witness path
P' such that dist(v, P') ≤ d.

Note that if P is (r, d)-close to v, then P is also (r', d')-close to v for any 0 < r' ≤ r and 0 < d ≤ d'.
Let the r-neighborhood of v, denoted by S_r(v), be the set of all P ∈ 𝒫_r that are (r, 2r)-close to v.
Given a set of paths 𝒫, we say that H ⊆ V is a hitting set for 𝒫 if every path in 𝒫 contains a vertex in H.

Definition 2.6. A network (G, ℓ) has highway dimension (HD) h if h is the smallest integer such that for
any r > 0 and any v ∈ V, there exists a hitting set H for S_r(v) (that depends on v and r) with |H| ≤ h.

Given r > 0 and v G V, we define the ball of radius r centered at v, Br{v), to be the set of all vertices 
within distance at most r from v. 

A notion related to highway dimension is that of a sparse shortest-path hitting set (SPHS). 

Definition 2.7. For r > 0, an (h, r)-SPHS is a hitting set C ⊆ V for 𝒫_r such that for all v ∈ V,
|B_{2r}(v) ∩ C| ≤ h.

Abraham et al. mum show: 

Theorem 2.8. If the highway dimension of a network (G, ℓ) is h, then (1) for any r > 0, a minimum
hitting set for 𝒫_r is an (h, r)-SPHS and (2) if shortest paths are unique one can find an (h log h, r)-SPHS
in polynomial time.

3 HHL and Highway Dimension

Abraham et al. 13E1D] show that a network with HD h and diameter D has an HL with |L(v)| = O(h log D),
and that in polynomial time one can find an HL with |L(v)| = O(h log h log D). We show similar results
for HHL.

Assume that edge lengths are at least 1 and let D be the diameter of the network (G, ℓ). A multiscale
SPHS of (G, ℓ) is a collection of sets C_i for 0 ≤ i ≤ ⌈log D⌉, where each C_i is an (h, 2^{i−1})-SPHS. In
particular, note that C_0 = V, since every vertex is a (1/2)-significant path. For 0 ≤ j ≤ ⌈log D⌉, let
Q_j = C_j \ ∪_{i > j} C_i.

Theorem 3.1. A network with HD h and diameter D has an HHL with |L(v)| = O(h log D) for all v ∈ V,
and if shortest paths are unique one can find in polynomial time an HHL with |L(v)| = O(h log h log D).

Proof. Consider the ordering r such that for i < j each w ∈ Q_i is less important than each v ∈ Q_j (i.e.
r(w) < r(v)), and vertices within each Q_i are ordered arbitrarily. For each v ∈ Q_i, define

L(v) = {v} ∪ {w : r(w) > r(v), w ∈ C_j ∩ B_{2^j}(v) for some j}.

Consider a shortest s-t path P and let i be such that 2^{i−1} < ℓ(P) ≤ 2^i. Assume, w.l.o.g., that r(s) < r(t).
Let s ∈ Q_x and t ∈ Q_y; we have x ≤ y.

If y ≥ i, then t ∈ B_{2^y}(s), so t ∈ Q_y ∩ B_{2^y}(s) ⊆ C_y ∩ B_{2^y}(s) and therefore t ∈ L(s). If y < i, then since
x ≤ y < i there must be a vertex w ≠ s, t such that w ∈ P ∩ C_i. By the definition of i, w ∈ B_{2^i}(s) and
w ∈ B_{2^i}(t). Therefore w ∈ L(s) ∩ L(t). In both cases, the cover property holds.

Using the multiscale SPHS provided by Theorem 2.8 we get that there exists an HHL such that
|L(v)| = O(h log D), and if shortest paths are unique we can compute in polynomial time an HHL such
that |L(v)| = O(h log h log D). □


Next we discuss the distance greedy d-HHL algorithm (defined in Section 2.3).

Theorem 3.2. In a network with HD h and diameter D, d-HHL finds a labeling with |L(v)| = O(h log n log D)
for all v ∈ V.


Proof. We show that for every vertex v and level i, L(v) contains O(h log n) hubs at level i.

Consider the (consecutive) iterations of the algorithm that select vertices at level i. Consider v ∈ V
and B_{2·2^i}(v). Since d-HHL already covered all vertex pairs of level greater than i, v can accumulate hubs
of level i only from vertices in B_{2·2^i}(v).

Suppose at some step the algorithm chooses a level i vertex w in B_{2·2^i}(v). Every x-y shortest path of
length ≥ 2^i hit by w is in S_{2^i}(v). By the definition of highway dimension, there is a hitting set H for
S_{2^i}(v) with |H| ≤ h.

We call a yet uncovered vertex pair [x, y] relevant if there is an x-y shortest path in S_{2^i}(v) and
dist(x, y) ≥ 2^i. Since H is a hitting set for S_{2^i}(v), H is also a hitting set for the set of relevant vertex
pairs (it hits a shortest path between each such pair). It follows that there is a vertex u ∈ H which covers
at least a 1/h fraction of the relevant vertex pairs. By the greedy choice of w, w hits at least the same number of relevant
pairs as u does.

After h consecutive vertices from B_{2·2^i}(v) are selected, the number of relevant vertex pairs is at most
a (1 − 1/h)^h ≤ 1/e fraction of the original, i.e., it is reduced by a factor of e. The initial number of relevant
vertex pairs is bounded by n², therefore the algorithm chooses O(h log n) vertices in B_{2·2^i}(v) before all
relevant vertex pairs are hit. Once all the relevant vertex pairs are hit, the algorithm will not choose any
level i vertices in B_{2·2^i}(v). □
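The counting step at the end of the proof can be checked numerically. The sketch below assumes the worst case allowed by the argument, namely that each selection leaves exactly a (1 − 1/h) fraction of the relevant pairs, starts from n² pairs, and counts selections until fewer than one pair remains; the function name is hypothetical.

```python
import math


def selections_until_covered(h, n):
    """Selections needed if each one removes exactly a 1/h fraction of pairs."""
    remaining = float(n * n)
    count = 0
    while remaining >= 1.0:
        remaining *= 1.0 - 1.0 / h
        count += 1
    return count


worst_case = selections_until_covered(5, 100)
bound = 2 * 5 * math.log(100) + 1  # h * ln(n^2) + 1
```

Because −ln(1 − 1/h) ≥ 1/h, the count never exceeds h · ln(n²) + 1 = O(h log n), matching the bound used in the proof.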


4 Upper Bounds 

In what follows we assume that isolated vertices are deleted from the center graphs, so their density
is the number of edges divided by the number of (non-isolated) vertices.

4.1 Greedy

We show that g-HHL finds an HHL of size that is within an O(√n log n) factor of the optimal HL size.
We prove this by bounding the ratio of the density of the center graph picked by g-HHL and the density
of the MDS of a center graph.

Theorem 4.1. g-HHL is an O(√n log n)-approximation algorithm for HL.

Proof. Suppose that at some iteration, the algorithm picks a center graph with m' arcs and n' vertices.
Then by the definition of g-HHL all center graphs have at most m' arcs, so the density of the maximum
density subgraph (over all center graphs) is at most √m'. The density ratio of the maximum density
subgraph to that of the chosen center graph is at most

\[ \frac{\sqrt{m'}}{m'/n'} = \frac{n'}{\sqrt{m'}} \le \frac{n'}{\sqrt{n'/2}} = \sqrt{2n'} \le \sqrt{2n}. \]

Here we use the fact that the chosen graph has no isolated vertices, so m' ≥ n'/2. It follows that the
density of the chosen center graph is a √(2n)-approximation of the maximum density of any subgraph. By
Lemma 2.1 we have that the labeling size is larger than the size of the optimal HL by at most an O(√n log n)
factor. □


Since HHL is a special case of HL we have

Corollary 4.2. g-HHL is an O(√n log n)-approximation algorithm for HHL.


4.2 Distance Greedy 

We show that d-HHL finds an HHL of size within an O(√n log n log D) factor of the optimal HL size.
But first we need to extend our concept of hub labels.

Cohen et al. [8] defined a more general notion of hub labels for a given set U of vertex pairs. Such
labels are required to have a vertex w ∈ L(u) ∩ L(v) which is on a shortest path between u and v for each
[u, v] ∈ U. The O(log n)-approximation algorithm described in Section 2.1 works for this more general
notion of HL; Lemma 2.1 and Theorem 4.1 hold.


Theorem 4.3. d-HHL is an O(√n log n log D)-approximation algorithm for HL.

Proof. Let OPT denote the size of the optimal HL. Let U_i be the set of vertex pairs at level i which are
not covered by vertices at higher levels when we run d-HHL. Let HL_i be the optimal HL to cover vertex
pairs from U_i and let OPT_i be the size of HL_i. Since U_i is a subset of all vertex pairs, OPT_i doesn't exceed
OPT. By Theorem 4.1 we can use the g-HHL algorithm to find an O(√n log n) approximation for HL_i.

Now let's return to d-HHL. Since every two pairs at the same level have the same weight and weights
of all lower-level vertex pairs are negligible, at the consecutive set of iterations in which d-HHL covers U_i
it picks the same vertices as g-HHL when we run it on U_i.

So the labels found by d-HHL have size

\[ \sum_{i=0}^{\lfloor \log D \rfloor} O(\sqrt{n} \log n)\, \mathrm{OPT}_i \le \sum_{i=0}^{\lfloor \log D \rfloor} O(\sqrt{n} \log n)\, \mathrm{OPT} = O(\sqrt{n} \log n \log D)\, \mathrm{OPT}. \]

□

Corollary 4.4. d-HHL is an O(√n log n log D)-approximation algorithm for HHL.


4.3 Weighted Greedy 

Although w-HHL is motivated by the approximation algorithm of Cohen et al., it does not achieve 
an O(log n) approximation. We show that w-HHL finds an HHL of size larger than the size of the optimal
HL by at most an O(√n log n) factor. The key to the analysis is the following lemma.

Lemma 4.5. If G = (V, E) is a graph with no isolated vertices, then G is an O(√n)-approximation of the
maximum density subgraph of G.

Proof. Consider a subgraph (V', E') of G. Let |V| = n, |E| = m, |V'| = n', |E'| = m'. Then

\[ m' \le \min\left(m, (n')^2\right) = n' \min\left(\frac{m}{n'}, n'\right) \le n' \sqrt{m}, \]

where the last step follows since if n' ≤ √m, then min(m/n', n') = n' ≤ √m, and if n' > √m, then
min(m/n', n') = m/n' ≤ √m.

Since G does not have isolated vertices, m ≥ n/2, so we have

\[ \frac{m'}{n'} \le \sqrt{m} = \frac{m}{n} \cdot \frac{n}{\sqrt{m}} \le \frac{m}{n} \cdot \frac{n}{\sqrt{n/2}} = \frac{m}{n} \sqrt{2n}. \]

□


Theorem 4.6. w-HHL is an O(√n log n)-approximation algorithm for HL.

Proof. At each iteration, w-HHL picks the center graph with the maximum ratio of the number of edges
to the number of vertices. By Lemma 4.5 the density of this graph is smaller than the density of
the densest subgraph of a center graph by at most an O(√n) factor. Therefore by Lemma 2.1 w-HHL produces an
HHL of size within an O(√n log n) factor of the size of the optimal HL. □

Corollary 4.7. w-HHL is an O(√n log n)-approximation algorithm for HHL.


5 Lower Bounds 

In this section we show that g-HHL, d-HHL and w-HHL do not give a poly-log approximation. We present 
graphs for which these algorithms find a labeling worse than the optimal HHL by a polynomial factor. 
We also show that our upper bounds are fairly tight. 


5.1 Greedy 


We show that for the graph in Figure 1a g-HHL finds a labeling larger by a factor of Ω(√n) than the
optimal HHL (and therefore the optimal HL).

Lemma 5.1. There is a graph family for which g-HHL finds an HHL of size Ω(n^{3/2}) while the optimal
HHL size is O(n).

Proof. Consider the directed graph G = (V, A) in Figure 1a. The graph G has n = Θ(k²) vertices
V = {a_1, ..., a_k, b_1, ..., b_{k+1}} ∪ {c_ij | 1 ≤ i ≤ k + 1, 1 ≤ j ≤ k}. The arcs are A = {(a_i, b_j) | 1 ≤ i ≤
k, 1 ≤ j ≤ k + 1} ∪ {(b_i, c_ij) | 1 ≤ i ≤ k + 1, 1 ≤ j ≤ k}, all of length 1.

Consider the center graphs when g-HHL starts and the labeling is empty. Shortest paths containing
a_i include the path from a_i to itself, the paths from a_i to b_x, and the paths from a_i to c_xy, so the number of
edges in the center graph of a_i is

\[ 1 + (k + 1) + k(k + 1) = (k + 1)^2 + 1. \]

Shortest paths containing b_i include the path from b_i to itself, k paths from a_j to b_i, another k paths
from b_i to c_ij, and the k² paths from a_j to c_ix, for a total of

\[ 1 + k + k + k^2 = (k + 1)^2. \]

Shortest paths containing c_ij include the path from c_ij to itself, the path from b_i to c_ij, and the k paths
from a_x to c_ij, for a total of

\[ 1 + 1 + k = k + 2. \]

So g-HHL will pick an a_i vertex for some i first. Note that when g-HHL picks an a_i vertex, the center
graph of a_j, j ≠ i, does not change, and the center graphs of the b_i's and the c_ij's lose edges. Therefore
g-HHL will continue picking a-vertices until there are none left. After that, the center graph of each b_i has
k + 1 edges and the center graph of each c_ij has 2 edges. So g-HHL will pick all b vertices next, and then all
the c vertices.

(a) Bad example for g-HHL and d-HHL.

Figure 1: Bad examples for greedy HHL algorithms.


The order found by g-HHL is a_1, ..., a_k, b_1, ..., b_{k+1}, followed by the c vertices, and the labeling it produces
is as follows: |Lf(a_i)| = |Lb(a_i)| = 1, |Lf(b_i)| = 1, |Lb(b_i)| = 1 + k, |Lf(c_ij)| = 1, and |Lb(c_ij)| = 2 + k.
Therefore the total size of the labeling is Ω(k³) = Ω(n^{3/2}).

A better order for this graph is b_1, ..., b_{k+1}, a_1, ..., a_k, followed by the c vertices. The canonical labeling
corresponding to this order is as follows: |Lf(a_i)| = (k + 1) + 1, |Lb(a_i)| = 1, |Lf(b_i)| = |Lb(b_i)| = 1,
|Lf(c_ij)| = 1, and |Lb(c_ij)| = 2. The total size of this labeling is O(k²) = O(n). □
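The two label-size totals can be reproduced on small instances. The sketch below builds the three-layer directed graph with unit-length arcs a_i → b_j and b_i → c_ij (vertex names are illustrative), computes the size of the canonical labeling for a given order by brute force, and compares the order a, b, c with the order b, a, c.

```python
INF = float("inf")


def three_layer_graph(k):
    """Return (a, b, c, arcs) for the layered graph with unit-length arcs."""
    a = ["a%d" % i for i in range(1, k + 1)]
    b = ["b%d" % i for i in range(1, k + 2)]
    c_of = {bi: ["c%d_%d" % (i, j) for j in range(1, k + 1)]
            for i, bi in enumerate(b, 1)}
    arcs = [(ai, bi) for ai in a for bi in b]
    arcs += [(bi, cij) for bi in b for cij in c_of[bi]]
    c = [cij for bi in b for cij in c_of[bi]]
    return a, b, c, arcs


def canonical_size(order, arcs):
    """Total size of the canonical HHL for `order` (most important first)."""
    rank = {v: i for i, v in enumerate(order)}
    dist = {u: {v: (0 if u == v else INF) for v in order} for u in order}
    for u, v in arcs:
        dist[u][v] = 1
    for m in order:  # Floyd-Warshall
        for u in order:
            for v in order:
                if dist[u][m] + dist[m][v] < dist[u][v]:
                    dist[u][v] = dist[u][m] + dist[m][v]
    size = 0
    for u in order:
        for w in order:
            if dist[u][w] == INF:
                continue
            on_paths = [x for x in order
                        if dist[u][x] + dist[x][w] == dist[u][w]]
            top = min(on_paths, key=rank.get)
            # w joins Lf(u) if w is the top vertex; u joins Lb(w) if u is.
            size += (top == w) + (top == u)
    return size


a, b, c, arcs = three_layer_graph(2)
greedy_size = canonical_size(a + b + c, arcs)   # order found by g-HHL
better_size = canonical_size(b + a + c, arcs)   # b vertices first
```

For k = 2 the g-HHL order costs 46 hubs against 34 for the better order; summing the per-vertex sizes listed above gives 2k + (k + 1)(k + 2) + k(k + 1)(k + 3) versus k(k + 3) + 2(k + 1) + 3k(k + 1), so the gap grows linearly in k.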


We have shown that for G = (V, A), g-HHL produces a labeling larger than the optimal one by an
Ω(√n) factor, so our O(√n log n) upper bound on the approximation ratio of g-HHL of Section 4.1 is
tight up to a logarithmic factor.


5.2 Distance Greedy 


We show that for the graph in Figure 1a d-HHL finds a labeling larger by a factor of Ω(√n) than the
optimal HHL (and therefore the optimal HL).

Lemma 5.2. There is a graph family for which d-HHL finds an HHL of size Ω(n^{3/2}) while the optimal
HHL size is O(n).


Proof. Consider the directed graph G = (V, A) depicted in Figure 1a. There are paths of length 0, 1, and
2. While there are some paths of length 2 yet uncovered, d-HHL selects a vertex to hit the maximum
number of paths of length 2. Weights of all paths of length 0 and 1 matter only when d-HHL chooses
between two vertices which hit exactly the same number of paths of length 2.

At the beginning a_i hits k(k + 1) paths of length 2 and b_i hits k² paths of length 2. So d-HHL selects
a_i. As d-HHL proceeds, the number of paths of length 2 hit by b_i decreases and the number of paths of
length 2 hit by a_i remains the same, k(k + 1).

Therefore the order found by d-HHL is a_1, ..., a_k, b_1, ..., b_{k+1}, followed by all the c vertices. Exactly
the same order is produced by g-HHL. From Lemma 5.1 we know that the size of the canonical labeling
of this order is Ω(n^{3/2}) while the size of the optimal HHL is O(n). □

So d-HHL can also produce a labeling whose size is an Ω(√n) factor away from optimal. This makes a fairly good
match with the O(√n log n log D) upper bound established in Section 4.2.

Theorem 3.2 gives an O(h log n log D) bound for the maximum label size produced by d-HHL. The graph
in Figure 1a gives us a good lower bound on the maximum label size as the following lemma specifies.

Lemma 5.3. There is a graph family for which d-HHL finds an HHL with maximum label size Ω(h log D).

Proof. Consider the directed graph G = (V, A) in Figure 1a. The diameter of G is 2. Let's find the
highway dimension h of G.

Abraham et al. [U Lemma 3.5] show that the maximum degree of a vertex is a lower bound on the
HD. Thus h is at least k + 1. Note that all b_i form a hitting set of size k + 1 for all paths of length greater
than 0. So any S_r(v) has a hitting set with at most k + 2 vertices and thus h = Θ(k).

In the labels found by d-HHL we have |L(c_ij)| = k + 2 (cf. the proof of Lemma 5.1). Since h = Θ(k)
and D = 2 we have |L(c_ij)| = Ω(h log D). □

Lemma 5.3 shows that the upper bound of Theorem 3.2 is tight up to an O(log n) factor.

5.3 Weighted Greedy

We show that for the graph in Figure 1b w-HHL finds a labeling of size larger than the size of the optimal
HHL (and therefore the optimal HL) by a factor of Ω(∛n).

Lemma 5.4. There is a graph family for which w-HHL finds an HHL of size Ω(n^{4/3}) while the optimal
HHL size is O(n).

Proof. Consider the undirected graph G = (V, E) in Figure 1b. The vertices of G are V = {a, b} ∪ {c_i |
1 ≤ i ≤ k} ∪ {d_ij | 1 ≤ i ≤ k, 1 ≤ j ≤ l}, so n = |V| = Θ(kl). The edges are E = {(a, d_ij) | 1 ≤ i ≤
k, 1 ≤ j ≤ l} ∪ {(b, c_i) | 1 ≤ i ≤ k} ∪ {(c_i, d_ij) | 1 ≤ i ≤ k, 1 ≤ j ≤ l}. All edges have length 2 except for
those adjacent to a, which have length 3. The lengths of the edges are set so that shortest paths between
distinct d vertices adjacent to the same c vertex go through the c vertex.

We set l = 2k², so k = Θ(n^{1/3}). As we shall see, this is large enough to make w-HHL choose the c
vertices before choosing b. However, this causes the c vertices to be added to the labels of many d vertices
and leads to a large total label size.

Consider the center graphs when w-HHL starts and the labeling is empty. Since the graph is connected, 
all center graphs have no isolated vertices, so all the denominators of the densities of the center graphs 
are the same and equal n. 

Consider now the numerators (the number of pairs covered) by the different vertices. Vertex a covers
the shortest paths between the d vertices adjacent to different c's. Therefore the center graph of a has
Ω((kl)²) edges, which is asymptotically more than the number of edges in the other center graphs. So
w-HHL chooses a to be the most important vertex.

Following this first choice of a, all vertex pairs consisting of a and d’s are covered, except for the pairs 
of d’s of the form dij and dir (both adjacent to Ci). The vertex dij is an endpoint of every uncovered 
shortest path containing it and therefore the density of the center graph of dij is constant. As we shall 
see, the density of other center graphs is higher, so the d vertices are chosen last. 

We show that after choosing a, w-HHL chooses c vertices until there are no c vertices left. Suppose
the number of remaining c vertices is t, where 1 ≤ t ≤ k. We show that the density of the center graph of each
of the remaining c's is larger than the density of the center graph of b. First we observe that at this point
the number of vertices in the center graph of b and in the center graph of each of the remaining c vertices
is the same, namely 1 + t + tl.

Shortest paths through b include the paths between Ci and Cj for i < j, the shortest paths from Ci to 
djr for i ^ j and paths from b to all the vertices that have not been picked yet. So the number of edges 
in the center graph of b is 

t{t - l)/2 -h t{t -l)l + {l + t + tl) . (1) 

Shortest paths through Ci include the paths between dij and dir for j < r, the paths from dir to Cj for 
j ^ i, the paths from b to dij, and the paths from Ci to all the vertices that have not been picked yet. So 
the number of edges in the center graph of a remaining c vertex is 

l{l — l)/2 -f l{t — l)-|-l-|-(l-|-t-|- tl) . (2) 

Subtracting Equation (1) from Equation (2), and using the facts that $l = 2k^2$ and $k \ge t$, we get

$$\begin{aligned}
l(l-1)/2 + lt - t(t-1)/2 - t(t-1)l &= l^2/2 + 2lt + t/2 - l/2 - t^2/2 - t^2 l \\
&> l^2/2 - t^2 l = 2k^4 - 2k^2 t^2 \ge 0 .
\end{aligned}$$
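As a sanity check on the algebra, the following Python sketch evaluates expressions (1) and (2) numerically with l = 2k^2 and confirms that their difference is positive for every 1 <= t <= k. The function names are illustrative only and do not come from the paper.

```python
def edges_center_b(t: int, l: int) -> int:
    """Expression (1): edges in the center graph of b when t c-vertices remain."""
    return t * (t - 1) // 2 + t * (t - 1) * l + (1 + t + t * l)


def edges_center_c(t: int, l: int) -> int:
    """Expression (2): edges in the center graph of a remaining c vertex."""
    return l * (l - 1) // 2 + l * (t - 1) + l + (1 + t + t * l)


def c_beats_b(k: int) -> bool:
    """True if (2) exceeds (1) for every 1 <= t <= k, with l = 2k^2."""
    l = 2 * k * k
    return all(edges_center_c(t, l) > edges_center_b(t, l) for t in range(1, k + 1))
```

Because both center graphs have the same number of vertices, comparing edge counts is enough to compare densities.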

So w-HHL chooses a first, followed by all c vertices, then b and all d vertices. The size of the 
corresponding canonical labeling is 

$$n + \sum_{t=1}^{k} (1 + t + tl) + 1 + kl = \Omega(lk^2) = \Omega(n^{4/3}).$$

A better ordering puts a first, followed by b, the c vertices, and finally the d vertices.
The size of the corresponding canonical labeling is

$$n + (1 + k + kl) + k(l + 1) + kl = O(n).$$

□ 
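To see how quickly the two orderings diverge, the sketch below evaluates both label-size sums for concrete k. It assumes l = 2k^2 and a vertex count of n = 2 + k + kl (a, b, the k c-vertices and the kl d-vertices); the vertex count is inferred from the center-graph sizes used in the proof, and the single +1 term is read as the label entry of b itself.

```python
def label_sizes(k: int) -> tuple[int, int]:
    """Return (size under the w-HHL order, size under the better order)."""
    l = 2 * k * k
    n = 2 + k + k * l  # assumed vertex count: a, b, k c's and k*l d's
    # a first, then the c's while t of them remain, then b, then the d's
    greedy = n + sum(1 + t + t * l for t in range(1, k + 1)) + 1 + k * l
    # a first, then b, then the c's, then the d's
    better = n + (1 + k + k * l) + k * (l + 1) + k * l
    return greedy, better
```

The second value stays below 4n for every k, while the ratio of the two grows roughly linearly in k, which is the cube root of n up to constants.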

Therefore on the graph in Figure 1b the size of the labeling produced by w-HHL is larger than the
optimal by a factor of $\Omega(n^{1/3})$. There is a factor of $\Omega(n^{1/6}\log n)$ gap between this lower bound and the
$O(\sqrt{n}\log n)$ upper bound established in an earlier section.


6 NP-Completeness 

6.1 Undirected Graphs 

In this section we prove that the problems of finding an optimal HL and an optimal HHL are NP-hard by 
a reduction from Vertex Cover (VC). The reduction takes an instance of VC consisting of a graph G and 
an integer k and produces an undirected graph G' and an integer k' such that the following conditions
are equivalent:

1. There is an HL of size k' in G'. 

2. There is an HHL of size k' in G'. 

3. There is a VC of size k in G. 


Our results imply NP-completeness of HL and HHL in undirected graphs. 

Before presenting the reduction we prove the following useful lemma. 

Lemma 6.1. Let G = (V, E) be a graph and S be a star graph, distinct from G, with a root s and |V|
leaves. Let G' be the union of the graphs G and S, with additional edges between s and some vertices of
V. If G' is connected then there are optimal HL and HHL for G' such that $s \in L(x)$ for every vertex x.

Proof. Let L be an optimal HL (or HHL) labeling of G'. First, assume that for a leaf $u \in S$ we have
that $s \notin L(u)$. Since $(s, u) \in G'$ we must have that $u \in L(s)$, and the only pair of vertices covered by
$u \in L(s)$ is [s, u]. So if we add s to L(u) and remove u from L(s), we get a valid labeling of the same size
as L, which is therefore optimal. Hence we may assume that $s \in L(u)$ and $u \notin L(s)$ for every leaf $u \in S$.

Next, assume that for some $v \in V$ and a leaf $u \in S$ we have that $u \in L(v)$. Since $u \in L(v)$ is used
only to cover the pair [u, v], we can remove u from L(v) and add s to L(v) if it is not already there while
keeping the labeling valid. This way we can transform L, without increasing its size, to a labeling such
that the labels of $v \in V$ do not contain leaves of S.

Finally, assume that there is a vertex $v \in V$ such that $s \notin L(v)$. Then $L(v) \cap S = \emptyset$. Since the pair
[u, v] for every leaf $u \in S$ has to be covered, L(u) must contain a vertex of V. Remove vertices of V from
L(u) for all $u \in S$ and add s to L(v) for all vertices $v \in V$ such that s is not in L(v) already. This keeps
the labeling valid and cannot increase its size. □


Now we describe the reduction. We reduce the problem of deciding whether there is a VC of size at
most k in G to the problem of deciding whether there is an HL of size at most k' in a graph G'. Lemma
6.7 shows that G' has an HL of size at most k' iff it has an HHL of size at most k', so it follows that our
reduction also proves that deciding whether there is an HHL of a given size is NP-complete. We
construct G' = (V', E') from G = (V, E) as follows.


1. For each vertex $v \in V$ we add three vertices, $v_1$, $v_2$, and $v_3$ to V' and two edges $\{v_1, v_2\}$ and $\{v_2, v_3\}$
to E'.

2. For each edge $\{u, v\} \in E$ we add an edge $\{u_1, v_1\}$ to E'.

3. We add a star S with 3|V| leaves and a root s to G' and add $\{s, v_1\}$ to E' for every $v \in V$.
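The three construction steps can be sketched in Python as follows. The vertex encoding is an assumption made for illustration: the copy $v_i$ is the tuple `(v, i)`, the root is the string `"s"`, and the star leaves are `("leaf", j)`, so the sketch assumes no vertex of G is itself named `"leaf"`.

```python
from collections import defaultdict


def build_reduction(vertices, edges):
    """Return adjacency sets of G' for a Vertex Cover instance G."""
    adj = defaultdict(set)

    def add_edge(x, y):
        adj[x].add(y)
        adj[y].add(x)

    for v in vertices:                      # step 1: path v_1 - v_2 - v_3
        add_edge((v, 1), (v, 2))
        add_edge((v, 2), (v, 3))
    for u, v in edges:                      # step 2: edge {u_1, v_1}
        add_edge((u, 1), (v, 1))
    for j in range(3 * len(vertices)):      # step 3: star S with 3|V| leaves
        add_edge("s", ("leaf", j))
    for v in vertices:                      # step 3: connect s to every v_1
        add_edge("s", (v, 1))
    return dict(adj)
```

For a triangle, for example, the result has 6|V| + 1 = 19 vertices and 21 edges.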


The graph G' is shown in Figure 2a. All edges have length 1. By Lemma 6.1 we can assume w.l.o.g.
that in an optimal labeling all vertices have s in their labels. Therefore, all paths between $v_i$ and $u_j$ such
that $\{u, v\} \notin E$ are covered (hereinafter when we write $v_i$ we mean $v_i$ for i = 1, 2, 3).

Figure 2: Reduction from VC to HL


For each vertex $v \in V$ we have a subgraph $G'_v$ in G' which is a path $(v_1, v_2, v_3)$. Any labeling must
cover all $[v_i, v_j]$ pairs of $G'_v$. We show that w.l.o.g. the labeling covers these paths either as in Figure 2b, in
which case we say that v is a type 1 vertex, or as in the companion panel of Figure 2, in which case we say
that v is a type 2 vertex. Note that a type 2 vertex uses one more hub in the labeling, so to reduce the labeling size we
want to minimize the number of type 2 vertices. We will show, however, that the type 2 vertices must
form a vertex cover for the labeling to be valid.


Lemma 6.2. There is an optimal labeling L of G' such that for each vertex $v \in V$, if $v_1 \in L(v_2)$ then v
is a type 2 vertex, otherwise v is a type 1 vertex.


Proof. Vertex $v_3 \in L(v_2)$ can cover only the pair $[v_2, v_3]$. So if $v_3 \in L(v_2)$ we can remove $v_3$ from $L(v_2)$
and put $v_2$ in $L(v_3)$ instead. Similarly $v_3 \in L(v_1)$ can cover only the pair $[v_1, v_3]$. So if $v_3 \in L(v_1)$ we can
remove $v_3$ from $L(v_1)$ and put $v_1$ in $L(v_3)$ instead. Now if $v_1 \in L(v_3)$ then we don’t need $v_2 \in L(v_1)$ and
can replace $v_2 \in L(v_1)$ by $v_1 \in L(v_2)$, keeping L optimal and making v a type 2 vertex.

If $v_1 \notin L(v_3)$ then we have $v_2 \in L(v_1)$ to cover the pair $[v_1, v_3]$. So either v is a type 1 vertex or there is an
additional hub $v_1 \in L(v_2)$. In the latter case we can remove $v_2$ from $L(v_1)$ and put $v_1$ into $L(v_3)$, making
v a type 2 vertex. □

For an edge $\{u, v\} \in E$ let $G'_{uv}$ be the subgraph of G' corresponding to this edge as shown in Figure 2.
$G'_{uv}$ contains all shortest paths between $v_i$ and $u_j$. Note that no vertex of G' other than $u_i$ and $v_j$ hits
these paths. We say that a hub $u_i \in L(v_j)$ or $v_i \in L(u_j)$ is a {u, v}-crossing hub.

Lemma 6.3. If there is an edge $\{u, v\} \in E$ then the labels of $u_i, v_i$, $1 \le i \le 3$, contain at least 3
{u, v}-crossing hubs.

Proof. Consider three pairs: $[u_1, v_1]$, $[u_2, v_2]$ and $[u_3, v_3]$. To cover each $[u_i, v_i]$ pair for i = 1, 2, 3 we need
a {u, v}-crossing hub. So $L(u_i) \cup L(v_i)$ contains a {u, v}-crossing hub. Since all three $L(u_i) \cup L(v_i)$ are
disjoint, L has at least 3 {u, v}-crossing hubs. □


The following lemma shows that the type 2 vertices must form a VC. 

Lemma 6.4. There is an optimal labeling L for G' such that for each edge $\{u, v\} \in E$ there is at least
one type 2 vertex among u and v.

Proof. By Lemma 6.2 we can assume that every vertex is either a type 1 or a type 2 vertex in L.

Suppose $\{u, v\} \in E$ and both u and v are type 1 vertices. A partial labeling is shown in Figure 2.
Since $u_1 \notin L(u_2)$, $u_1$ cannot cover the pair $[v_1, u_2]$. Similarly, $v_1$ cannot cover the pair $[u_1, v_2]$ and neither
$u_1$ nor $v_1$ can cover the pair $[u_2, v_2]$. With one more hub to cover the pair $[u_1, v_1]$ it follows that we need
at least 4 different {u, v}-crossing hubs already. Let $u_2 \in L(v_2)$ (the case $v_2 \in L(u_2)$ is similar). Then
we need one more {u, v}-crossing hub to cover the pair $[u_2, v_3]$ and the total number of hubs to cover
shortest paths in $G'_{uv}$ is at least 9.

If we make v a type 2 vertex then 8 hubs suffice as shown in Figure 2. We didn’t change other hubs
in labels of $u_i$ so all $[u_i, w_j]$ pairs for $w \ne v$ remain covered. Also all $[v_i, w_j]$ pairs remain covered, since
$v_2$ can’t be used as a hub for any $[v_i, w_j]$ pair. □


The following lemma gives a reduction from VC to HL. 


Lemma 6.5. The graph G has a VC of size k if and only if G' has an HL of size 14|V| + 1 + 3|E| + k.

Proof. Assume G has a vertex cover of size at most k. We construct an HL of G' as follows. We put s
and v itself in L(v) for every $v \in V'$. Since there are 6|V| + 1 vertices in G' this contributes 12|V| + 1
hubs. Then we make each vertex of the vertex cover a type 2 vertex and each vertex which is not in the
vertex cover a type 1 vertex. We use 2 hubs to cover $G'_v$ for a type 1 vertex and 3 hubs for a type 2
vertex, for the total of 2|V| + k hubs. For each edge $\{u, v\} \in E$ we use 3 {u, v}-crossing hubs to cover
$G'_{uv}$ as shown in Figure 2. So the total labeling size is (12|V| + 1) + (2|V| + k) + 3|E| = 14|V| + 1 + 3|E| + k.

Now assume that L is an optimal HL of G' of size at most 14|V| + 1 + 3|E| + k. By Lemma 6.1 we know that
any vertex $w \in G'$ has s in its label and by Lemma 6.2 we know that there exists such an L that makes
every vertex $v \in V$ either a type 1 or a type 2 vertex. By Lemma 6.3 we know that there are at least 3
{u, v}-crossing hubs for any edge $\{u, v\} \in E$. Since the size of L is at most 14|V| + 1 + 3|E| + k it follows
that there are at most k type 2 vertices in L. Lemma 6.4 implies that these vertices form a vertex
cover. □
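The forward direction of the proof can be checked mechanically on small instances. The sketch below builds G', derives a labeling from a given vertex cover, and verifies the cover property by BFS. The figures are not available here, so the choice of crossing hubs is an assumption: for each edge, the copy $x_1$ of a type 2 endpoint x is placed in the three labels of the other endpoint. The vertex encoding (`(v, i)`, `"s"`, `("leaf", j)`) is likewise illustrative.

```python
from collections import deque


def build_reduction(vertices, edges):
    """Adjacency sets of G'; v_i is (v, i) and the star root is "s"."""
    adj = {}

    def add_edge(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    for v in vertices:
        add_edge((v, 1), (v, 2))
        add_edge((v, 2), (v, 3))
        add_edge("s", (v, 1))
    for u, v in edges:
        add_edge((u, 1), (v, 1))
    for j in range(3 * len(vertices)):
        add_edge("s", ("leaf", j))
    return adj


def bfs(adj, src):
    """Unit-length shortest-path distances from src."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def labeling_from_cover(vertices, edges, cover):
    adj = build_reduction(vertices, edges)
    label = {x: {x, "s"} for x in adj}            # s and the vertex itself
    for v in vertices:
        if v in cover:                            # type 2: three path hubs
            label[(v, 2)].add((v, 1))
            label[(v, 3)].update({(v, 1), (v, 2)})
        else:                                     # type 1: two path hubs
            label[(v, 1)].add((v, 2))
            label[(v, 3)].add((v, 2))
    for u, v in edges:
        hub, other = (v, u) if v in cover else (u, v)
        for i in (1, 2, 3):                       # three crossing hubs per edge
            label[(other, i)].add((hub, 1))
    return adj, label


def is_valid(adj, label):
    """Cover property: every pair shares a hub on one of its shortest paths."""
    dist = {x: bfs(adj, x) for x in adj}
    return all(
        any(dist[x][h] + dist[h][y] == dist[x][y] for h in label[x] & label[y])
        for x in adj
        for y in adj
    )


def size(label):
    return sum(len(hubs) for hubs in label.values())
```

On a valid cover the size equals the sum of the three contributions counted in the proof (12|V| + 1 hubs for s and the vertices themselves, 2|V| + k path hubs, and 3|E| crossing hubs). When the given set is not a vertex cover, the same construction fails the cover check.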


Theorem 6.6. The problem of deciding whether an undirected graph has an HL of size at most k is 
NP-complete. 


The following lemma shows that our reduction is in fact also a valid reduction from VC to finding an 
optimal HHL. 

Lemma 6.7. The graph G' has an HL of size k' if and only if it has an HHL of size k'. 


Proof. The “if” part follows from the fact that every HHL is an HL. For the “only if” part consider an
optimal HL L of size at most k'. By Lemma 6.2 each vertex is either of type 1 or of type 2. Consider the
following order of the vertices of G'. The most important vertex is s followed by all the leaves of S. Then
we put the triple $v_1, v_2, v_3$ for all type 2 vertices, where for each v, $v_1$ is more important than $v_2$ which
is more important than $v_3$, and the order of the triples corresponding to different vertices is arbitrary.
Finally we put a triple $v_2, v_1, v_3$ for all type 1 vertices, where for each v, $v_2$ is more important than $v_1$ which
is more important than $v_3$, and the order of the triples corresponding to different vertices is arbitrary.
The labels in the panels of Figure 2 respect this order. Thus the HHL $L'$ corresponding to this order
has exactly 3 {u, v}-crossings for each $\{u, v\} \in E$. Therefore by Lemma 6.3 $L'$ has the same size as L. □
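The order used in this proof can be turned into a canonical hierarchical labeling and measured directly. The sketch below assumes the usual canonical construction (for each pair, the most important vertex lying on any shortest path between them is added to both labels) and the same illustrative vertex encoding as before: `(v, i)` for $v_i$, `"s"` for the root, `("leaf", j)` for star leaves.

```python
from collections import deque


def build_reduction(vertices, edges):
    """Adjacency sets of G'; v_i is (v, i) and the star root is "s"."""
    adj = {}

    def add_edge(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    for v in vertices:
        add_edge((v, 1), (v, 2))
        add_edge((v, 2), (v, 3))
        add_edge("s", (v, 1))
    for u, v in edges:
        add_edge((u, 1), (v, 1))
    for j in range(3 * len(vertices)):
        add_edge("s", ("leaf", j))
    return adj


def bfs(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def hierarchy_order(vertices, type2):
    """s, then leaves, then type 2 triples (v1,v2,v3), then type 1 triples (v2,v1,v3)."""
    order = ["s"] + [("leaf", j) for j in range(3 * len(vertices))]
    for v in vertices:
        if v in type2:
            order += [(v, 1), (v, 2), (v, 3)]
    for v in vertices:
        if v not in type2:
            order += [(v, 2), (v, 1), (v, 3)]
    return order


def canonical_hhl(adj, order):
    rank = {x: i for i, x in enumerate(order)}    # smaller rank = more important
    dist = {x: bfs(adj, x) for x in adj}
    label = {x: set() for x in adj}
    for x in adj:
        for y in adj:
            on_path = [h for h in adj if dist[x][h] + dist[h][y] == dist[x][y]]
            top = min(on_path, key=rank.__getitem__)
            label[x].add(top)
            label[y].add(top)
    return label
```

On a path with four vertices and the cover {1, 2} as the type 2 set, the resulting labeling has the same total size as the three contributions counted in the proof of Lemma 6.5.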


Lemma 6.7 and Lemma 6.5 immediately imply the following.


Theorem 6.8. The problem of deciding whether an undirected graph has an HHL of size at most k is 
NP-complete. 


Theorem 6.6 and Theorem 6.8 show that both HL and HHL are NP-complete in undirected graphs
with unit lengths. If we change the length of the edges $\{s, v_1\}$ for $v \in V$ from 1 to 0.9 our proof is not affected.
However, the shortest paths in G' become unique. So HL and HHL are NP-complete in undirected
graphs even when shortest paths are unique.


6.2 Directed Graphs 

Here we show that both optimal HL and HHL are NP-hard in directed graphs. We begin with HHL, for 
which there is a simple reduction from the undirected case. 

Let G be an undirected graph. We transform G to a directed graph G' by replacing each edge {u, v}
with two arcs (u, v) and (v, u). Now we present the reduction.

Lemma 6.9. The graph G has an HHL of size k if and only if G' has an HHL of size 2k.

Figure 3: Optimal HL for directed graph $C'_4$. Solid and dashed arcs represent forward and backward
labels respectively.

Figure 4: Gadget corresponding to an edge {u, v} in the reduction from VC to HL in a directed graph
(curly arcs represent labels).


Proof. To show the “only if” part, we take the labeling $(L_f, L_b)$ constructed from L as follows: $L_f(v) := L(v)$
and $L_b(v) := L(v)$.

Now we show the “if” part. We can assume that $(L_f, L_b)$ is a canonical labeling (or replace the labeling
by a smaller canonical one). Since G' is symmetric, from the definition of canonical labeling it follows
that for any vertex v the forward label has exactly the same hubs as the backward label. Moreover the
distances from v to and from the hubs are the same. So L defined as $L(v) := L_f(v)$ is a valid labeling for
G. □

Theorem 6.10. The problem of deciding whether a directed graph has an HHL of size at most k is 
NP-complete. 

The following remark implies that the above reduction doesn’t work for HL. 

Remark 6.11. For a directed graph a minimum HL need not be symmetric.

Proof. Consider the 4-cycle graph $C_4 = (V, E)$, $V = \{v_0, v_1, v_2, v_3\}$, $E = \{\{v_i, v_{i+1 \bmod 4}\} \mid 0 \le i \le 3\}$
and the corresponding directed graph $C'_4$. An HL L of size 16 for $C'_4$ is shown in Figure 3 (for example
$L_f(v_0)$ contains $v_0$ and $v_3$ and $L_b(v_0)$ contains $v_0$ and $v_1$). Note that it is not symmetric as for example
$L_f(v_0) \ne L_b(v_0)$. Any labeling in $C'_4$ satisfying $L_f = L_b$ corresponds to a labeling in $C_4$ of half the size.
So in order to show that there is no symmetric labeling of $C'_4$ of size 16 we show that there is no labeling
of $C_4$ of size at most 8. Indeed we need 4 hubs to cover the pairs $[v_i, v_i]$ and 4 hubs to cover the pairs
$[v_i, v_{i+1 \bmod 4}]$. This already accounts for 8 hubs. Therefore no $v_i$ is in $L(v_{i \pm 2 \bmod 4})$. To cover the $[v_0, v_2]$
pair we need $v_1$ (or $v_3$, the case is similar) to be in both $L(v_0)$ and $L(v_2)$ and therefore $L(v_1)$ contains
only $v_1$. But now we have no common hub for the pair $[v_1, v_3]$ and therefore it is uncovered. So there is no
HL of size 8 for $C_4$. □
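Both claims of the remark are small enough to check exhaustively. The Python sketch below searches all undirected labelings of the 4-cycle by increasing size, and validates one asymmetric directed labeling of size 16. The full directed labeling is a reconstruction: only $L_f(v_0)$ and $L_b(v_0)$ are stated in the text, and the remaining labels are chosen here so that the cover property holds.

```python
from itertools import combinations

N = 4  # vertices v_0 .. v_3 of the 4-cycle


def cyc_dist(a, b):
    """Distance between v_a and v_b on the cycle (same in both directions)."""
    d = abs(a - b) % N
    return min(d, N - d)


def covers(forward, backward):
    """Cover property for labels (forward[s], backward[t]) on the 4-cycle."""
    return all(
        any(cyc_dist(s, h) + cyc_dist(h, t) == cyc_dist(s, t)
            for h in forward[s] & backward[t])
        for s in range(N)
        for t in range(N)
    )


def min_undirected_hl_size():
    """Smallest size of an undirected HL of the 4-cycle, found by brute force."""
    slots = [(x, h) for x in range(N) for h in range(N) if h != x]
    for extra in range(len(slots) + 1):
        for chosen in combinations(slots, extra):
            label = {x: {x} for x in range(N)}   # every vertex needs itself
            for x, h in chosen:
                label[x].add(h)
            if covers(label, label):
                return N + extra
    return None


def directed_labeling():
    """Asymmetric labeling of the bidirected 4-cycle with 2 hubs per label."""
    forward, backward = {}, {}
    for v in range(N):
        step = -1 if v % 2 == 0 else 1
        forward[v] = {v, (v + step) % N}
        backward[v] = {v, (v - step) % N}
    return forward, backward
```

The search reports that the smallest undirected labeling has 9 hubs, so a symmetric directed labeling needs at least 18, while the asymmetric one above uses 16.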

Now we present another reduction from VC to HL in a directed graph. For a VC instance G = (V, E) we
construct an HL instance G' = (V', A'), $V' = \{w\} \cup \{v_1, v_2 \mid v \in V\} \cup \{e \mid e \in E\}$, $A' = \{(w, v_1), (v_1, v_2) \mid
v \in V\} \cup \{(u_1, v_2), (v_1, u_2), (u_2, e), (v_2, e) \mid e = \{u, v\} \in E\}$. All arcs have length 1. For each edge
e = {u, v} from G we have a gadget as shown in Figure 4 (consider only straight arcs).

For any labeling we have x in both $L_f(x)$ and $L_b(x)$ for all vertices x and either $x \in L_b(y)$ or $y \in L_f(x)$
for all arcs (x, y). Let us call such hubs mandatory and all other hubs non-mandatory. Mandatory hubs
cover all pairs [x, y] such that $\mathrm{dist}(x, y) \le 1$. Any labeling for G' has at least $M(G') = 2|V'| + |A'|$
mandatory hubs.

Lemma 6.12. The graph G has a VC of size k if and only if G' has an HL of size M(G') + k.

Proof. We claim that mandatory hubs are enough to cover all pairs in G' except [w, e] for $e \in E$, which
means all pairs [x, y] with $\mathrm{dist}(x, y) \le 2$. The sufficient labeling is shown in Figure 4 by curly arcs: a
solid curly arc (x, y) means $y \in L_f(x)$ and a dashed curly arc (x, y) means $y \in L_b(x)$. Indeed, for a pair
[x, y] with dist(x, y) = 2 we have either x = w or y = e for some $e \in E$. In the former case $y = v_2$ for
some $v \in V$ and the common hub is $v_1$. In the latter case $x = u_1$ for some $u \in V$ and either e = {u, v} or
e = {v, v'} for some neighbor $v \in V$ of u. In both cases $v_2$ is the common hub.

Figure 5: Example which separates HL and HHL.


Since dist(w, e) = 3 for every $e \in E$ we need a non-mandatory hub to cover a [w, e] pair. The
non-mandatory hubs correspond to the vertex cover in G. If there is a VC of size k in G then it is sufficient
to use exactly k non-mandatory hubs: add $v_2$ to $L_f(w)$ for every v in the VC.

Suppose there is an HL with at most k non-mandatory hubs. We build a VC of size at most k. For a
non-mandatory hub $e \in L_f(w)$ and any non-mandatory hub in $L_b(e)$ for an edge $e = \{u, v\} \in E$, add u
to the VC. For a non-mandatory hub $v_2 \in L_f(w)$ for some $v \in V$, add v to the VC. It is easy to see that
this is indeed a vertex cover. □
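The directed gadget graph and the labeling from the proof can be sketched and checked as follows. Node names are illustrative: `"w"` is the source, `(v, 1)` and `(v, 2)` are the two copies of v, and `("e", u, v)` is the node of the edge {u, v}. The mandatory hubs follow the choice described in the proof ($v_1 \in L_f(w)$, $v_1 \in L_b(v_2)$, $v_2 \in L_f(u_1)$ for crossing arcs, and $v_2 \in L_b(e)$).

```python
from collections import deque
from math import inf


def build_labeling(vertices, edges, cover):
    """Return (nodes, arcs, forward, backward) for the directed reduction."""
    arcs = set()
    for v in vertices:
        arcs |= {("w", (v, 1)), ((v, 1), (v, 2))}
    for u, v in edges:
        e = ("e", u, v)
        arcs |= {((u, 1), (v, 2)), ((v, 1), (u, 2)), ((u, 2), e), ((v, 2), e)}
    nodes = {x for arc in arcs for x in arc}

    forward = {x: {x} for x in nodes}        # every node is its own hub
    backward = {x: {x} for x in nodes}
    for v in vertices:
        forward["w"].add((v, 1))             # mandatory hub for arc (w, v_1)
        backward[(v, 2)].add((v, 1))         # mandatory hub for arc (v_1, v_2)
        if v in cover:
            forward["w"].add((v, 2))         # non-mandatory hub for pairs [w, e]
    for u, v in edges:
        e = ("e", u, v)
        forward[(u, 1)].add((v, 2))          # arc (u_1, v_2)
        forward[(v, 1)].add((u, 2))          # arc (v_1, u_2)
        backward[e] |= {(u, 2), (v, 2)}      # arcs (u_2, e) and (v_2, e)
    return nodes, arcs, forward, backward


def bfs(succ, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in succ[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def is_valid(nodes, arcs, forward, backward):
    """Every reachable ordered pair (s, t) needs a hub on a shortest s-t path."""
    succ = {x: [] for x in nodes}
    for x, y in arcs:
        succ[x].append(y)
    dist = {x: bfs(succ, x) for x in nodes}
    for s in nodes:
        for t, d in dist[s].items():
            hubs = forward[s] & backward[t]
            if not any(dist[s].get(h, inf) + dist[h].get(t, inf) == d for h in hubs):
                return False
    return True


def size(forward, backward):
    return sum(map(len, forward.values())) + sum(map(len, backward.values()))
```

For a triangle with cover {0, 1} the labeling is valid and has exactly 2|V'| + |A'| + 2 hubs. With the non-cover {0} the pair [w, e] for the edge {1, 2} is left uncovered.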

Theorem 6.13. The problem of deciding whether a directed graph has an HL of size at most k is 
NP-complete. 


7 HL vs. HHL 


In [13], it is shown that there can be a gap between the size of the optimal HHL and the size of the
optimal HL. We show that for the graph in Figure 5 the gap is $\Omega(\sqrt{n})$.

Theorem 7.1. There is a graph family for which the optimal HHL size is $\Omega(\sqrt{n})$ times larger than the
optimal HL size.

Proof. Consider the undirected graph shown in Figure 5. The graph consists of k distinct stars each with
k − 1 leaves. The centers of the stars are connected such that they form a clique. Finally, there is an
additional vertex s connected to the leaves of all stars. The total number of vertices is $n = k^2 + 1$. The
length of every edge is 1.

Consider the following HL for this graph. The vertex s is in every label. A center c of a star S is in
the labels of all of the vertices of S. Finally, every star-center has every other star-center in its label. It
is easy to verify that the cover property holds for this labeling. Each leaf u of some star S has a label of
size O(1). The label of s is of size O(1). The size of the label of each star-center is k + 1. It follows that
the total size of this labeling is O(n).

To construct an HHL, we need to order the centers of the stars. Fix such an order. Consider a leaf u
of some star with center c(u), and let i be the number of star-centers which are more important than
c(u). For each star-center v that is more important than c(u), (u, c(u), v) is the shortest path between u
and v, so either v is in L(u) or u is in L(v). This accounts for i hubs in the labels due to the pairs u, v.
The total contribution of such hubs to the size of the labeling is

$$(k-1)\sum_{i=1}^{k-1} i = k(k-1)^2/2 = \Omega(n^{3/2}).$$

It follows that the total size of any hierarchical labeling is $\Omega(n^{3/2})$. This yields an $\Omega(\sqrt{n})$ gap between
the optimal HL and the optimal HHL. □
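The construction can be explored numerically. The sketch below builds the graph for a given k >= 2, constructs the non-hierarchical labeling described in the proof, and compares it with a canonical hierarchical labeling for one particular order (s, then the centers, then the leaves). Node names such as `("c", i)` and `("leaf", i, j)` are illustrative, and the canonical construction (the most important vertex on any shortest path is added to both labels) is an assumption about how the hierarchical labeling is derived from an order.

```python
from collections import deque


def build_graph(k):
    """k stars with k-1 leaves each, centers in a clique, s joined to all leaves."""
    adj = {}

    def add_edge(x, y):
        adj.setdefault(x, set()).add(y)
        adj.setdefault(y, set()).add(x)

    for i in range(k):
        for j in range(i + 1, k):
            add_edge(("c", i), ("c", j))
        for j in range(k - 1):
            add_edge(("c", i), ("leaf", i, j))
            add_edge("s", ("leaf", i, j))
    return adj


def bfs(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def hub_labeling(adj, k):
    """The HL from the proof: s everywhere, own center, centers know all centers."""
    centers = {("c", i) for i in range(k)}
    label = {}
    for x in adj:
        label[x] = {x, "s"}
        if x in centers:
            label[x] |= centers
        elif x != "s":
            label[x].add(("c", x[1]))
    return label


def canonical_hhl(adj, order):
    rank = {x: i for i, x in enumerate(order)}    # smaller rank = more important
    dist = {x: bfs(adj, x) for x in adj}
    label = {x: set() for x in adj}
    for x in adj:
        for y in adj:
            on_path = [h for h in adj if dist[x][h] + dist[h][y] == dist[x][y]]
            top = min(on_path, key=rank.__getitem__)
            label[x].add(top)
            label[y].add(top)
    return label


def is_valid(adj, label):
    dist = {x: bfs(adj, x) for x in adj}
    return all(
        any(dist[x][h] + dist[h][y] == dist[x][y] for h in label[x] & label[y])
        for x in adj
        for y in adj
    )


def size(label):
    return sum(len(hubs) for hubs in label.values())
```

The HL has 4k^2 - 2k + 1 hubs in total (3 per leaf, k + 1 per center, 1 for s), whereas the hierarchical labeling for the chosen order already contains at least k(k-1)^2/2 hubs, in line with the counting argument.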

The results of an earlier section imply that the gap in Theorem 7.1 is within an O(log n) factor of the best
possible.


8 Concluding Remarks

Our lower bounds for greedy algorithms show that, in contrast with HL, the greedy algorithm does not
give a poly-log approximation for HHL. This motivates the question of whether a poly-log approximation
algorithm for HHL exists. Our lower bound for w-HHL is a factor of roughly $n^{1/6}$ (up to a logarithmic
term) away from the upper bound, which leaves open the question of determining the polynomial factor in
the approximation guarantee of the w-HHL algorithm.

On many problem classes g-HHL and w-HHL find labelings of size close to that found by the O(log n)-
approximation algorithm for HL [10]. It would be interesting to get a theoretical explanation of this
phenomenon, for example by proving a better approximation ratio for g-HHL or w-HHL on natural classes
of graphs.